Skip to content

[SPARK-27961][SQL] DataSourceV2Relation should not have refresh method#24815

Closed
gengliangwang wants to merge 3 commits intoapache:masterfrom
gengliangwang:removeRefreshTable
Closed

[SPARK-27961][SQL] DataSourceV2Relation should not have refresh method#24815
gengliangwang wants to merge 3 commits intoapache:masterfrom
gengliangwang:removeRefreshTable

Conversation

@gengliangwang
Copy link
Member

@gengliangwang gengliangwang commented Jun 6, 2019

What changes were proposed in this pull request?

The newly added Refresh method in PR #24401 prevented the work of moving DataSourceV2Relation into catalyst. It calls case table: FileTable => table.fileIndex.refresh() while FileTable belongs to sql/core.

More importantly, Ryan Blue pointed out DataSourceV2Relation is immutable by design, it should not have refresh method.

How was this patch tested?

Unit test

@gengliangwang
Copy link
Member Author

@rdblue @jzhuge @gatorsmile

@rdblue
Copy link
Contributor

rdblue commented Jun 6, 2019

@gengliangwang, the catalog interface has a method to invalidate a source table. Do you want to implement REFRESH TABLE for v2 tables as a follow-up?

@rdblue
Copy link
Contributor

rdblue commented Jun 6, 2019

+1 from me if tests pass. (I saw the comment about the Paruqet -> ORC fix on the previous PR.)

@gengliangwang
Copy link
Member Author

@gengliangwang, the catalog interface has a method to invalidate a source table. Do you want to implement REFRESH TABLE for v2 tables as a follow-up?

Yes, I will create a follow-up. This one is to unblock the work of moving DataSourceV2Relation into catalyst.

@gengliangwang gengliangwang changed the title [WIP][SPARK-27961][SQL] DataSourceV2Relation should not have refresh method [SPARK-27961][SQL] DataSourceV2Relation should not have refresh method Jun 6, 2019
@dongjoon-hyun
Copy link
Member

@gengliangwang . Could you update the PR description a little bit? me is you or John?

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logically, this reverts [SPARK-27504][SQL] File source V2: support refreshing metadata cache.

I have two questions.

cc @gatorsmile

@@ -41,7 +41,7 @@ abstract class MetadataCacheSuite extends QueryTest with SharedSQLContext {
test("SPARK-16336,SPARK-27504 Suggest doing table refresh " +
"when encountering FileNotFoundException") {
Copy link
Member

@dongjoon-hyun dongjoon-hyun Jun 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test case name Suggest doing table refresh is not valid anymore DSv2 after this PR. Maybe, recreating is better?

@gengliangwang
Copy link
Member Author

gengliangwang commented Jun 6, 2019

@dongjoon-hyun I think we can partially revert it. Some of the works are still valuable:

  1. Showing warning message in file source v2 if the underlying files have been updated
  2. The test suite is reserved. Otherwise, we can see test failure in the Parquet V2 migration PR.

@SparkQA
Copy link

SparkQA commented Jun 6, 2019

Test build #106248 has finished for PR 24815 at commit 8ab2ae5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jun 6, 2019

Test build #106252 has finished for PR 24815 at commit 626127d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jun 7, 2019

Test build #106267 has finished for PR 24815 at commit 98c6105.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gengliangwang
Copy link
Member Author

retest this please.

@SparkQA
Copy link

SparkQA commented Jun 7, 2019

Test build #106271 has finished for PR 24815 at commit 98c6105.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gengliangwang
Copy link
Member Author

@dongjoon-hyun Any other concerns on this PR?

@dongjoon-hyun
Copy link
Member

dongjoon-hyun commented Jun 8, 2019

Sorry for the delay, @gengliangwang . Yes, for now, this might be the less noisy path because SPARK-27504 was merged long time ago.

@dongjoon-hyun
Copy link
Member

Retest this please.

@SparkQA
Copy link

SparkQA commented Jun 8, 2019

Test build #106296 has finished for PR 24815 at commit 98c6105.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member

Retest this please.

@SparkQA
Copy link

SparkQA commented Jun 8, 2019

Test build #106300 has finished for PR 24815 at commit 98c6105.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM. Merged to master.
cc @gatorsmile since this PR has a history and the context.

@jzhuge
Copy link
Member

jzhuge commented Jun 9, 2019

Thanks @gengliangwang @rdblue @dongjoon-hyun for the quick resolution!

emanuelebardelli pushed a commit to emanuelebardelli/spark that referenced this pull request Jun 15, 2019
## What changes were proposed in this pull request?

The newly added `Refresh` method in PR apache#24401 prevented the work of moving DataSourceV2Relation into catalyst. It calls `case table: FileTable => table.fileIndex.refresh()` while `FileTable` belongs to sql/core.

More importantly, Ryan Blue pointed out DataSourceV2Relation is immutable by design, it should not have refresh method.

## How was this patch tested?

Unit test

Closes apache#24815 from gengliangwang/removeRefreshTable.

Authored-by: Gengliang Wang <gengliang.wang@databricks.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants